Multimodal Skipgram Using Convolutional Pseudowords
نویسندگان
چکیده
This work studies the representational mapping across multimodal data such that given a piece of the raw data in one modality the corresponding semantic description in terms of the raw data in another modality is immediately obtained. Such a representational mapping can be found in a wide spectrum of real-world applications including image/video retrieval, object recognition, action/behavior recognition, and event understanding and prediction. To that end, we introduce a simplified training objective for learning multimodal embeddings using the skip-gram architecture by introducing convolutional “pseudowords:” embeddings composed of the additive combination of distributed word representations and image features from convolutional neural networks projected into the multimodal space. We present extensive results of the representational properties of these embeddings on various word similarity benchmarks to show the promise of this approach.
منابع مشابه
Dependency Based Embeddings for Sentence Classification Tasks
We compare different word embeddings from a standard window based skipgram model, a skipgram model trained using dependency context features and a novel skipgram variant that utilizes additional information from dependency graphs. We explore the effectiveness of the different types of word embeddings for word similarity and sentence classification tasks. We consider three common sentence classi...
متن کاملModeling the Visual Word Form Area Using a Deep Convolutional Neural Network
The visual word form area (VWFA) is a region of the cortex located in the left fusiform gyrus, that appears to be a waystation in the reading pathway. The discovery of the VWFA occurred in the late twentieth century with the advancement of functional magnetic resonance imaging (fMRI). Since then, there has been an increasing number of neuroimaging studies to understand the VWFA, and there are d...
متن کاملVisually Grounded and Textual Semantic Models Differentially Decode Brain Activity Associated with Concrete and Abstract Nouns
Important advances have recently been made using computational semantic models to decode brain activity patterns associated with concepts; however, this work has almost exclusively focused on concrete nouns. How well these models extend to decoding abstract nouns is largely unknown. We address this question by applying state-of-the-art computational models to decode functional Magnetic Resonanc...
متن کاملMultimodal MRI brain tumor segmentation using random forests with features learned from fully convolutional neural network
In this paper, we propose a novel learning based method for automated segmentation of brain tumor in multimodal MRI images. The machine learned features from fully convolutional neural network (FCN) and hand-designed texton features are used to classify the MRI image voxels. The score map with pixelwise predictions is used as a feature map which is learned from multimodal MRI training dataset u...
متن کاملBenchmarking Multimodal Sentiment Analysis
We propose a framework for multimodal sentiment analysis and emotion recognition using convolutional neural network-based feature extraction from text and visual modalities. We obtain a performance improvement of 10% over the state of the art by combining visual, text and audio features. We also discuss some major issues frequently ignored in multimodal sentiment analysis research: the role of ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1511.04024 شماره
صفحات -
تاریخ انتشار 2015